k-Independent Gaussians Fool Polynomial 
Threshold Functions 

Daniel M. Kane 
November 14, 2011 



1 Introduction 

In this paper we consider the ability of limited independence to fool polynomial
threshold functions (PTFs). We recall that a (degree-d) polynomial threshold
function is a function of the form $f(x) = \mathrm{sgn}(p(x))$ for some n-dimensional
polynomial p of degree at most d. There has been recent interest in polynomial
threshold functions in several areas of computer science. This paper expands on
previous work in derandomizing polynomial threshold functions using limited
independence.

We say that a random variable X fools a family of functions with respect
to some distribution Y if for every function $f$ in the family

$$|E[f(X)] - E[f(Y)]| = O(\epsilon).$$
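
To make the definition concrete, the following sketch estimates the fooling gap for one threshold function by sampling. It is an illustration only and not a construction from this paper: the helper names, the convention sgn(0) = +1, and the particular pairwise independent family are assumptions chosen for the example. The family consists of three standard Gaussians that are pairwise independent, yet the sign of the third is forced so that the product of all three is non-negative, so 2-independence fails to fool the degree-3 threshold function sgn(x1 x2 x3).

```python
import random

def sgn(t):
    # Convention assumed for this sketch: sgn(0) = +1.
    return 1.0 if t >= 0 else -1.0

def sample_pairwise_independent(rng):
    """Three standard Gaussians, pairwise independent but not jointly
    independent: the sign of x3 is chosen so that x1 * x2 * x3 >= 0."""
    x1 = rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.0, 1.0)
    s = sgn(x1) * sgn(x2)               # uniform sign, independent of x1 alone and of x2 alone
    x3 = s * abs(rng.gauss(0.0, 1.0))   # random sign times |Gaussian| is a standard Gaussian
    return (x1, x2, x3)

def sample_independent(rng):
    """Three fully independent standard Gaussians."""
    return tuple(rng.gauss(0.0, 1.0) for _ in range(3))

def threshold_mean(sampler, p, trials, seed):
    """Monte Carlo estimate of E[sgn(p(X))] for X drawn from sampler."""
    rng = random.Random(seed)
    return sum(sgn(p(sampler(rng))) for _ in range(trials)) / trials

def p(x):
    # Degree-3 multilinear polynomial.
    return x[0] * x[1] * x[2]

gap = abs(threshold_mean(sample_pairwise_independent, p, 20000, 1)
          - threshold_mean(sample_independent, p, 20000, 2))
print(gap)
```

The estimated gap stays close to 1 however many samples are taken, which illustrates why the amount of independence k must grow with the degree d.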

In this paper we will be interested in the case where the family is of all degree-d
polynomial threshold functions in n variables, and Y is an n-dimensional
Gaussian distribution, and in particular the case where X is an arbitrary family
of k-independent Gaussian random variables. In particular, we prove that

Theorem 1. Let $d > 0$ be an integer and $\epsilon > 0$ a real number. Then there exists
a $k = O_d(\epsilon^{-2^{O(d)}})$ so that for any degree d polynomial p and any k-independent
family of Gaussians X and fully independent family of Gaussians Y

$$|E[\mathrm{sgn}(p(X))] - E[\mathrm{sgn}(p(Y))]| = O(\epsilon).$$

There has been a significant amount of recent work on the problem of fooling
low degree polynomial threshold functions of Gaussian or Bernoulli random
variables, especially via limited independence. It was shown in [3] that
$\tilde{O}(\epsilon^{-2})$-independence is sufficient to fool degree-1 polynomial threshold
functions of Bernoulli random variables, and that this is tight up to
polylogarithmic factors. In [?] it was shown that $\tilde{O}(\epsilon^{-9})$-independence
sufficed for degree-2 polynomial threshold functions of Bernoullis and that
$O(\epsilon^{-2})$ and $O(\epsilon^{-8})$ suffice for degree 1 and 2 polynomial threshold
functions of Gaussians. The degree 1 case was also extended by [?], who show
that limited independence fools threshold functions of polynomials that can be
written in terms of a small number of linear polynomials. Finally, in [?] a more
complicated pseudorandom generator for degree-d polynomial threshold functions
of Bernoulli variables is developed with seed length $2^{O(d)} \log(n) \epsilon^{-8d-3}$.
As far as we are aware, our paper is the first result to show that degree-d
polynomial threshold functions are fooled by k-independence for any k depending
only on $\epsilon$ and d for any $d \geq 3$.

2 Overview 

We prove Theorem 1 first by proving our result for multilinear polynomials, and
then finding a reduction to the general case. In particular we prove

Proposition 2. Let $d > 0$ be an integer and $\epsilon > 0$ a real number. Then there
exists a $k = O_d(\epsilon^{-2^{O(d)}})$ so that for any degree d multilinear polynomial
$p : R^n \to R$ and any k-independent family of Gaussians X and fully independent
family of Gaussians Y

$$|E[\mathrm{sgn}(p(X))] - E[\mathrm{sgn}(p(Y))]| = O(\epsilon).$$

We define the notation $A \approx_\epsilon B$ to mean $|A - B| = O(\epsilon)$.

The proof of Proposition 2 will be analogous to the proof of the main Theorem
in [4]. Our basic idea is as follows.

In Section 3 we prove bounds on the moments of multilinear Gaussian
polynomials. These results are essentially a reworking of the main result of [?].

In Section 4 we use these bounds to prove a structure Theorem for multilinear
polynomials. In particular, we prove that we can write $p(X)$ in the form
$h(P_1(X), P_2(X), \ldots, P_N(X))$ where $h$ is a polynomial and the $P_i(X)$ are
multilinear polynomials with relatively small higher moments. More specifically,
the polynomials $P_i$ will be split into d different classes, with the $i$-th class
consisting of polynomials each of whose $m_i$-th moments are $O_d(m_i)^{m_i/2}$. This
decomposition allows us to write $f(X) = \mathrm{sgn}(p(X))$ as $\mathrm{sgn}(h(P_1(X), \ldots, P_N(X)))$.

From here we make use of the FT-Mollification method (see [4] for another
example of this technique). The basic idea will be to approximate $\mathrm{sgn} \circ h$
by some smooth function $\tilde{h}$, and let $\tilde{f}(X) = \tilde{h}(P_1(X), \ldots, P_N(X))$, which we
do in Section 5. Our general strategy now will be to prove the sequence of
approximations:

$$E[f(Y)] \approx_\epsilon E[\tilde{f}(Y)] \approx_\epsilon E[\tilde{f}(X)] \approx_\epsilon E[f(X)].$$

The middle approximation will be proved by approximating $\tilde{f}$ by one of its Taylor
polynomials. This is a polynomial, and hence its expectation is preserved under
limited independence. The Taylor error can again be bounded by a polynomial,
which will have small expectation since the $P_i$ have small moments. We cover
this in Section 6.

The first approximation above holds roughly because $\tilde{f}$ approximates $f$
everywhere except near places where $f$ changes sign. The result will hold due
to anti-concentration results for $p(Y)$. The last approximation similarly holds
because of anti-concentration of $p(X)$. Although anti-concentration of the
k-independent X can be proven using the above techniques applied to some other
function $g$ for which $g$ is an upper bound for $f$, we deal with the problem
indirectly. In particular, we show that $E[f(X)]$ can be bounded on either side
by $E[\mathrm{sgn}(p(Y) + c)] + O(\epsilon)$ for $c$ a small constant, and use anti-concentration of
$p(Y)$. We cover this in Section 7.

Our application of FT-Mollification is complicated by the fact that our moment
bounds on the $P_j$ are not uniform in $j$. To deal with this, we will construct
$\tilde{h}$ to have different degrees of smoothness in different directions, and the
parameter $C_i$ will describe the amount of smoothness along the $i$-th set of coordinates
(corresponding to the $i$-th class of the $P_j$). This forces us to come up with
modified techniques for producing $\tilde{h}$ and dealing with the Taylor polynomial
and Taylor error.

In Section 8 we reduce the general case to the case of multilinear polynomials
by approximating $p(X)$ by a multilinear polynomial in some larger number of
variables.

Finally, in Section 10 we discuss the actual requirements for k and the
possibility of extending our results to the Bernoulli setting.

3 Moment Bounds 

In this Section, we prove a bound on the moments of arbitrary degree-d multi- 
linear polynomials of Gaussians. Our bound is based on the main result of [3]. 
It should be noted that this result is the only reason that we restrict ourselves 
for most of this paper to the case of multilinear polynomials, as it will make our 
bound easier to state and work with. 

Throughout this Section, we will refer to two slightly different notions: that
of a multilinear polynomial and that of a multilinear form. For our purposes,
a multilinear polynomial $p(X)$ (where $X$ has n coordinates) will be a polynomial so
that the degree of $p$ with respect to any of the coordinates of $X$ is at most 1.
A multilinear form will be a polynomial $q(X^1, X^2, \ldots, X^d)$ (here each of the
$X^i$ may themselves have several coordinates) so that $q$ is linear (homogeneous
degree 1) in each of the $X^i$. We call such a $q$ symmetric if it is symmetric with
respect to interchanging the $X^i$. Finally, we note that to every homogeneous
multilinear polynomial $p$ of degree d, there is an associated multilinear form
$q(X^1, \ldots, X^d)$, which is the unique symmetric multilinear form so that
$p(X) = q(X, \ldots, X)$.
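
The correspondence between a homogeneous multilinear polynomial and its symmetric multilinear form can be sketched in a few lines. The representation below (a dictionary from index sets to coefficients) and the function names are assumptions made for this illustration only; the form is obtained by averaging each monomial over all assignments of its variables to the d argument slots.

```python
from itertools import permutations
from math import factorial

def eval_poly(coeffs, x):
    """Evaluate p(x) = sum_S a_S * prod_{i in S} x_i, where coeffs maps
    frozensets S of coordinate indices (all of the same size d) to a_S."""
    total = 0.0
    for S, a in coeffs.items():
        term = a
        for i in S:
            term *= x[i]
        total += term
    return total

def eval_symmetric_form(coeffs, xs):
    """Evaluate the symmetric multilinear form q(x^1, ..., x^d) associated
    to p: each monomial is averaged over the d! ways of assigning its
    variables to the d arguments xs[0], ..., xs[d-1]."""
    d = len(xs)
    total = 0.0
    for S, a in coeffs.items():
        for order in permutations(sorted(S)):
            term = a
            for slot, i in enumerate(order):
                term *= xs[slot][i]
            total += term
    return total / factorial(d)

# Example: p(x) = 2 x0 x1 - x1 x2, a homogeneous multilinear polynomial of degree 2.
coeffs = {frozenset({0, 1}): 2.0, frozenset({1, 2}): -1.0}
x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 4.0]
print(eval_poly(coeffs, x), eval_symmetric_form(coeffs, [x, x]))
print(eval_symmetric_form(coeffs, [x, y]), eval_symmetric_form(coeffs, [y, x]))
```

Evaluating the form on the diagonal recovers the polynomial, and swapping the arguments leaves the value unchanged, which are the two defining properties used in this Section.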

Before we can state our results we need a few more definitions. 

Definition. Let $p : R^n \to R$ be a homogeneous degree-d multilinear polynomial.
Let $X_i$, $1 \le i \le n$, be independent standard Gaussians. For integers $1 \le \ell \le d$
define $M_\ell(p)$ in the following way. Consider all possible choices of: a partition
of $\{1, \ldots, n\}$ into sets $S_1, S_2, \ldots, S_\ell$; a sequence of integers $d_i \ge 1$, $1 \le i \le \ell$,
so that $d = \sum_{i=1}^{\ell} d_i$; and a sequence of multilinear polynomials $p_i$, $1 \le i \le \ell$, so that
$p_i$ depends only on the coordinates in $S_i$, $p_i$ is homogeneous of degree $d_i$, and
$E[p_i(X)^2] = 1$. We let $M_\ell(p)$ be the supremum over all choices of $S_i, d_i, p_i$ as
above of

$$E\left[p(X) \prod_{i=1}^{\ell} p_i(X)\right].$$

Note that by Cauchy-Schwarz we have that $M_\ell(p) \le E[p(X)^2]^{1/2}$. We now
define a similar quantity more closely related to what is used in [6].

Definition. Let $q : (R^n)^d \to R$ be a degree-d multilinear form. Let $X^i$, $1 \le i \le d$,
be independent standard n-dimensional Gaussians. For integers $1 \le \ell \le d$
define $M_\ell(q)$ in the following way. Consider all possible choices of: a partition
of $\{1, \ldots, d\}$ into non-empty subsets $S_1, \ldots, S_\ell$, with $S_i = \{c_{i,1}, \ldots, c_{i,d_i}\}$; and
a set of multilinear forms $q_i$ of degree $d_i$ with $E[q_i(X^{c_{i,1}}, \ldots, X^{c_{i,d_i}})^2] \le 1$. We
define $M_\ell(q)$ to be the supremum over all such choices of $S_i$ and $q_i$ of

$$E\left[q(X^1, \ldots, X^d) \prod_{i=1}^{\ell} q_i(X^{c_{i,1}}, \ldots, X^{c_{i,d_i}})\right].$$

We now state the moment bound whose proof will take up the rest of this 
Section. 

Proposition 3. Let p be a homogeneous degree d multilinear polynomial, X
a family of independent standard Gaussians, and $k \ge 2$. Then

$$E[|p(X)|^k] = \Theta_d\left(\sum_{\ell=1}^{d} M_\ell(p) k^{\ell/2}\right)^k.$$

This is essentially a version of Theorem 1 of [6]:

Theorem ([6] Theorem 1). For q a degree-d multilinear form and $X^i$ independent
standard n-dimensional Gaussians and k an integer at least 2,

$$E[|q(X^1, \ldots, X^d)|^k] = \Theta_d\left(\sum_{\ell=1}^{d} M_\ell(q) k^{\ell/2}\right)^k.$$

Proof of Proposition 3. The basic idea of the proof is to relate $M_\ell(p)$ to $M_\ell(q)$
and $E[|p|^k]$ to $E[|q|^k]$ for $q$ the symmetric multilinear form associated to a
multilinear polynomial $p$.

Let $q$ be the symmetric multilinear form associated to $p$. We
claim that for each $\ell$, $M_\ell(p) = \Theta_d(M_\ell(q))$. Suppose that $p_1$ and $p_2$ are
degree d multilinear polynomials, and $q_1$ and $q_2$ the associated symmetric
multilinear forms. It is easy to see (by using the standard basis of coefficients) that
$E[p_1(X)p_2(X)] = d!\,E[q_1(X^1, \ldots, X^d) q_2(X^1, \ldots, X^d)]$. Similarly it is easy to see
that if $p$ is a degree d multilinear polynomial, and $p_i$ are degree $d_i$ multilinear
polynomials on distinct sets of coordinates, and $q$, $q_i$ their associated symmetric
multilinear forms, we have

$$E\left[p(X) \prod_i p_i(X)\right] = d!\,E\left[q(X) \prod_i q_i(X^{(i)})\right],$$

where $q(X) = q(X^1, \ldots, X^d)$, and $X^{(i)} = (X^{d_1+\cdots+d_{i-1}+1}, \ldots, X^{d_1+\cdots+d_i})$.

This means that $M_\ell(p) = O_d(M_\ell(q))$ since given the appropriate $S_i, d_i, p_i$ we
can use the symmetrizations of the $p_i$ to get as good a bound for $M_\ell(q)$ up to a
constant factor. To show the other direction we need to show that $M_\ell(q)$ is not
changed by more than a constant factor if we require that the $q_i$ are supported
on disjoint sets of coordinates. But we note that if you randomly assign each
coordinate to a $q_i$ and take the part that only depends on those coordinates,
you lose a factor of at most $d^d$ on average.
Hence we have that

$$E[|q(X^1, \ldots, X^d)|^k] = \Theta_d\left(\sum_{\ell=1}^{d} M_\ell(p) k^{\ell/2}\right)^k.$$


We just need to show that the moments of $p$ and the moments of $q$ are the same
up to a factor of $O_d(1)^k$. This can be shown using the main Theorem of [8]
which in our case states that there is some constant $C_d$ depending only on d so
that for any such $p$, $q$ and $x$,

$$\Pr(|p(X)| > x) \le C_d \Pr(|q(X^1, \ldots, X^d)| > x/C_d)$$

and

$$\Pr(|q(X^1, \ldots, X^d)| > x) \le C_d \Pr(|p(X)| > x/C_d).$$

Our result follows from noting that for any random variable Y,

$$E[|Y|^k] = \int_0^\infty k x^{k-1} \Pr(|Y| > x)\,dx.$$


□ 
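
The final identity in the proof, which expresses a k-th moment as an integral of the tail probability, can be checked numerically. The sketch below is an illustration under assumptions of its own: it uses a random variable uniform on [0, 1] (for which both sides are known in closed form) and a simple midpoint rule; neither choice comes from the paper.

```python
def moment_from_tail(k, tail, upper, steps=20000):
    """Midpoint-rule approximation of the integral over [0, upper] of
    k * x^(k-1) * Pr(|Y| > x) dx, where tail(x) returns Pr(|Y| > x)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += k * x ** (k - 1) * tail(x)
    return total * h

def uniform_tail(x):
    """Pr(|Y| > x) for Y uniform on [0, 1]; here E[|Y|^k] = 1 / (k + 1)."""
    return max(0.0, 1.0 - x)

for k in (2, 3, 6):
    print(k, moment_from_tail(k, uniform_tail, 1.0), 1.0 / (k + 1))
```

Because the identity only involves tail probabilities, a two-sided comparison of tails up to constant factors, as in the decoupling inequalities above, transfers directly to a comparison of moments up to a factor exponential in k.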



4 Structure 

In this Section, we will prove the following structure theorem for degree-d mul- 
tilinear polynomials. 

Proposition 4. Let p be a degree-d multilinear polynomial where the sum of
the squares of its coefficients is at most 1. Let $m_1 \le m_2 \le \cdots \le m_d$ be
integers. Then there exist integers $n_1, n_2, \ldots, n_d$, $n_i = O_d(m_1 m_2 \cdots m_{i-1})$, and
non-constant, homogeneous multilinear polynomials $h_1, \ldots, h_d$, $P_{i,j}$, $1 \le i \le d$,
$1 \le j \le n_i$, so that:

1. $h_i$ is degree $i$

2. If $P_{i,a_1} \cdots P_{i,a_i}$ appears as a term in $h_i(P_{i,j})$, then the sum of the degrees
of the $P_{i,a_j}$ is d

3. The sum of the squares of the coefficients of $h_i$ is $O_d(1)$

4. The sum of the squares of the coefficients of $P_{i,j}$ is 1

5. Each variable occurs in at most one monomial in $h_i$

6. If Y is a standard Gaussian and $k \le m_i$ then $E[|P_{i,j}(Y)|^k] = O_d(\sqrt{k})^k$

7. $p(Y) = \sum_{i=1}^{d} h_i(P_{i,1}(Y), P_{i,2}(Y), \ldots, P_{i,n_i}(Y))$.

This will allow us to write $p$ in terms of other polynomials each with smaller
moments. The basic idea of the proof follows from a proper interpretation of
Proposition 3. Essentially Proposition 3 says that the higher moments of $p$
will be small unless $p$ has some significant component consisting of a product of
polynomials $P_1, \ldots, P_\ell$ of lower degree. The basic idea is that if such polynomials
exist, we can split off these $P_i$ as new polynomials in our decomposition, leaving
$p - P_1 \cdots P_\ell$ with smaller size than $p$. We repeatedly apply this procedure to
$p$ and all of the other polynomials that show up in our decomposition. Since
each step decreases the size of the polynomial being decomposed, and produces
only new polynomials of smaller degree, this process will eventually terminate.
Beyond these ideas, the proof consists largely of bookkeeping to ensure that we
have the correct number of P's and that they have an appropriate number of
small moments.

Proof. We first prove our statement for homogeneous, multilinear polynomials p. 
We reduce the general case to this one by writing p as a sum of its homogeneous 
parts and decomposing each of them. 

We would like to simply use the decomposition $P_{1,1} = p$ and $h_1$ is the
identity, but the moments of $p$ may be too large. On the other hand, we know
by Proposition 3 that this can only be the case if $p$ has large correlation with
some product of smaller degree polynomials $P_1 \cdots P_k$. So if $c = E[p \cdot P_1 \cdots P_k]$,
we can write $p' = p - cP_1 \cdots P_k$. Now either $p'$ has small moments or we
can break off another product of polynomials. This process must eventually
terminate because when we replaced $p$ by $p'$ we decreased the expectation of its
square by $c^2$. We will then apply this technique recursively to each of the $P_i$.

We define a dot product on the space of multilinear polynomials by $\langle P, Q \rangle =
E[P(Y)Q(Y)]$ where Y is a standard Gaussian. Note that the square of the
corresponding norm, $|P|^2$, equals the sum of the squares of the coefficients
of P.
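
The claim that this dot product reduces to a sum over coefficients can be verified exactly for small examples. The sketch below is illustrative only: the dictionary representation and function names are assumptions of the example. It expands E[P(Y)Q(Y)] monomial by monomial using the Gaussian moments E[Y^m] (zero for odd m, (m-1)!! for even m) and compares the result with the dot product of the coefficient vectors.

```python
from math import prod

def gaussian_moment(m):
    """E[Y^m] for a standard Gaussian Y: 0 for odd m, (m-1)!! for even m."""
    if m % 2 == 1:
        return 0
    result = 1
    for j in range(1, m, 2):
        result *= j
    return result

def expectation_of_product(P, Q):
    """E[P(Y) Q(Y)] for multilinear polynomials given as dicts mapping
    frozensets of coordinate indices to coefficients."""
    total = 0.0
    for S, a in P.items():
        for T, b in Q.items():
            # Exponent of each coordinate in the product of the two monomials.
            exps = {}
            for i in list(S) + list(T):
                exps[i] = exps.get(i, 0) + 1
            total += a * b * prod(gaussian_moment(e) for e in exps.values())
    return total

def coefficient_dot(P, Q):
    """Dot product of the coefficient vectors of P and Q."""
    return sum(a * Q.get(S, 0.0) for S, a in P.items())

P = {frozenset({0, 1}): 0.6, frozenset({2}): 0.8}
Q = {frozenset({0, 1}): 1.0, frozenset({1, 2}): 2.0}
print(expectation_of_product(P, Q), coefficient_dot(P, Q))
print(expectation_of_product(P, P))  # squared norm: 0.6^2 + 0.8^2
```

Distinct multilinear monomials are orthogonal because some coordinate appears to the first power in their product, and each monomial has norm 1 because every coordinate then appears squared.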

We begin by letting $q = p$. We note that by Proposition 3 the $k$-th
moment of $q$ for $k \le m_1$ is $O_d(\sqrt{k})^k$ unless for some $2 \le \ell \le d$ we have that
$M_\ell(q) > m_1^{1/2}/m_1^{\ell/2}$, or equivalently, unless there exist polynomials $P_1, \ldots, P_\ell$
of norm 1, so that $c = \langle q, P_1 \cdots P_\ell \rangle > m_1^{(1-\ell)/2}$. If this is the case, we replace
$q$ by $q' = q - cP_1 \cdots P_\ell$. Note that $|q'|^2 = |q|^2 - c^2$. We repeat this process
with $q'$ until finally we are left with a polynomial $q$ so that for all $k \le m_1$ the
$k$-th moment of $q$ is $O_d(\sqrt{k})^k$ (this process must terminate since at each step we
decrease $|q|^2$ by at least $m_1^{1-\ell}$). We now can write $p$ as $q$ plus a sum of $c_j$ times
products of lower degree polynomials. It should be noted that the sum of the
squares of the $c_j$ is at most 1. Letting $P_{1,1} = q$ and $h_1$ be the identity, we can
now write

$$p(Y) = \sum_{i=1}^{d} h_i(P_{i,1}(Y), P_{i,2}(Y), \ldots, P_{i,n_i}(Y)).$$

Here $|h_i| = O_d(1)$, $|P_{i,j}| \le 1$, $n_i = O_d(m_1^{i-1})$, and for $k \le m_1$, the $k$-th moment
of $P_{1,j}$ is $O_d(\sqrt{k})^k$. Unfortunately, the moments of the other P's might be too
large. We show by induction on $s$ that we have such a decomposition where
all of the appropriate moments of the $P_{i,j}$ for $i \le s$ are bounded and so that
$n_i = O_d(m_1 m_2 \cdots m_{i-1})$ for all $i$.

We have already proved the $s = 1$ case. To prove the general case, we first
write $p$ as $\sum_{i=1}^{d} h_i(P_{i,1}(Y), P_{i,2}(Y), \ldots, P_{i,n_i}(Y))$ using the induction
hypothesis. This satisfies all of our criteria except that the $P_{s,j}$ might have moments
which are too large. We fix this by rewriting each of the $P_{s,j}$ using the same
method we originally used to rewrite $p$, only guaranteeing that the first $m_s$
moments are small. This will make it so that our new $P_{s,j}$ have appropriately
bounded moments, but may introduce new terms in the $h_t$ for $t > s$ (if some
term shows up in multiple monomials, define several $P_{i,t}$ that are equal). We
need to make sure that we did not introduce too many new terms and that the
sum of the squares of the coefficients is not too large.

To show the latter, note that our original procedure at most doubled the sum
of the squares of the coefficients. Therefore applying this to each $P_i$ in a term
$cP_1 \cdots P_s$ will increase the sum of the squares of the coefficients by a factor of
at most $2^s$. Hence since the sum of the squares of the coefficients was $O_d(1)$
before, it still is afterwards.

Finally we need to show that our new decomposition did not introduce too
many new terms. It is not hard to see that for each $P_{s,i}$ we need to introduce
$O_d(m_s^{t-s})$ new $P_{t,j}$ terms. Therefore the total number of such new terms is
$O(m_s^{t-s} n_s) = O_d(m_1 m_2 \cdots m_{t-1})$.

Finally we note that our induction terminates at $s = d$. This is because the
$P_{d,j}$ must be linear polynomials of bounded norm, and therefore automatically
satisfy the necessary moment bounds. This completes our inductive step and
proves the Proposition. □

5 FT-Mollification 

We let F be a degree-d polynomial threshold function $F = \mathrm{sgn}(p)$, where $p$
is a degree d multilinear polynomial in n variables whose sum of squares of
coefficients equals 1. We pick $m_1, \ldots, m_d$ (their exact sizes will be determined
later). For later convenience, we assume the $m_i$ are all even. We then have a
decomposition of F given by Proposition 4 as

$$F(X) = \mathrm{sgn}\left(\sum_{i=1}^{d} h_i(P_{i,1}(X), \ldots, P_{i,n_i}(X))\right) = f(P(X)),$$

where $P_i(X)$ is the vector-valued polynomial $(P_{i,1}(X), \ldots, P_{i,n_i}(X))$, $P$ is the
vector of all of them, and $f$ is the function $f(P_1, \ldots, P_d) = \mathrm{sgn}(\sum_i h_i(P_i))$.
Furthermore, we have that for $k \le d m_i$ the $k$-th moment of any coordinate of
$P_i$ is $O_d(\sqrt{k})^k$. We also have that $h_i$ is a degree $i$ multilinear
polynomial the sum of the squares of whose coefficients is at most 1.

Our basic strategy now will involve approximating $f$ by a smooth function
$\tilde{f}$, and letting $\tilde{F}(X) = \tilde{f}(P(X))$. We will then proceed to prove

$$E[F(Y)] \approx_\epsilon E[\tilde{F}(Y)] \approx_\epsilon E[\tilde{F}(X)] \approx_\epsilon E[F(X)]. \quad (1)$$

We will produce $\tilde{f}$ from $f$ using the technique of mollification. Namely we will
have $\tilde{f} = f * \rho$ for an appropriately chosen smooth function $\rho$. However, we will
need this $\rho$ to have several other properties so we will go into some depth here
to construct it.

Lemma 5. Given an integer $n \ge 0$ and a constant $C$, there is a function
$\rho_C : R^n \to R$ so that

1. $\rho_C \ge 0$.

2. $\int_{R^n} \rho_C(x)\,dx = 1$.

3. For any unit vector $v \in R^n$, and any non-negative integer $k$,
$\int_{R^n} |D_v^k \rho_C(x)|\,dx \le C^k$, where $D_v^k$ is the $k$-th directional derivative in the direction $v$.

4. For $D > 0$, $\int_{|x| > D} |\rho_C(x)|\,dx = O\left(\left(\frac{n}{CD}\right)^2\right)$.



Proof. We prove this for $C = 2$ and we note that we can obtain other values of
$C$ by setting $\rho_C(x) = (C/2)^n \rho_2(Cx/2)$. We begin by defining

$$B(\xi) = \begin{cases} 1 - |\xi|^2 & \text{if } |\xi| \le 1 \\ 0 & \text{otherwise.} \end{cases}$$

We then define

$$\rho_2(x) = \rho(x) = \frac{|\hat{B}(x)|^2}{|B|_2^2},$$

where $\hat{B}$ denotes the Fourier transform of $B$. Clearly $\rho$ is non-negative. Also
clearly

$$\int_{R^n} \rho(x)\,dx = \frac{|\hat{B}|_2^2}{|B|_2^2} = 1$$

by the Plancherel Theorem.
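For the third property we note that

$$D_v^k\left(|\hat{B}|^2\right) = \sum_{i=0}^{k} \binom{k}{i} D_v^i(\hat{B})\, D_v^{k-i}(\overline{\hat{B}}).$$

Letting $\xi$ be the dual vector corresponding to $v$ we have that

$$\left|D_v^k \rho\right|_1 = \frac{1}{|B|_2^2} \left| \sum_{i=0}^{k} \binom{k}{i} D_v^i(\hat{B})\, D_v^{k-i}(\overline{\hat{B}}) \right|_1
\le \frac{1}{|B|_2^2} \sum_{i=0}^{k} \binom{k}{i} \left|D_v^i(\hat{B})\right|_2 \left|D_v^{k-i}(\hat{B})\right|_2$$

$$\le \frac{1}{|B|_2^2} \sum_{i=0}^{k} \binom{k}{i} \left|\xi^i B\right|_2 \left|\xi^{k-i} B\right|_2
\le \frac{1}{|B|_2^2} \sum_{i=0}^{k} \binom{k}{i} |B|_2^2 = 2^k.$$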

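For the last property we note that it is enough to prove that

$$\int_{R^n} |x|^2 \rho(x)\,dx = O(n^2).$$

We have that

$$\int_{R^n} |x|^2 \rho(x)\,dx = \frac{1}{|B|_2^2} \sum_{i=1}^{n} \left|x_i \hat{B}\right|_2^2 = \frac{\sum_{i=1}^{n} \left|\frac{\partial B}{\partial \xi_i}\right|_2^2}{|B|_2^2}.$$

Now $\frac{\partial B}{\partial \xi_i}$ is $-2\xi_i$ on the unit ball and 0 outside. Hence the sum of the squares
of these is $4|\xi|^2$ on $|\xi| \le 1$ and 0 outside. Hence since both numerator and
denominator above are integrals of spherically symmetric functions, their ratio
is equal to

$$\int_{R^n} |x|^2 \rho(x)\,dx = \frac{4\int_0^1 r^{n+1}\,dr}{\int_0^1 r^{n-1}(1-r^2)^2\,dr}.$$

Using integration by parts, the denominator is

$$\int_0^1 r^{n-1}(1-r^2)^2\,dr = \frac{4}{n}\int_0^1 r^{n+1}(1-r^2)\,dr = \frac{8}{n(n+2)}\int_0^1 r^{n+3}\,dr = \frac{8}{n(n+2)(n+4)}.$$

Hence

$$\int_{R^n} |x|^2 \rho(x)\,dx = \frac{4}{n+2} \cdot \frac{n(n+2)(n+4)}{8} = \frac{n(n+4)}{2} = O(n^2).$$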

□ 
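
The radial integrals behind the second-moment bound can be checked numerically. The sketch below rests on assumptions stated here rather than taken verbatim from the text: it takes B to equal 1 - |xi|^2 on the unit ball and 0 outside, uses a Fourier normalization in which multiplication by x_i corresponds to differentiation in xi_i with no extra constant, and approximates the integrals with a midpoint rule. Under those assumptions the second moment of rho equals n(n+4)/2.

```python
def radial_integral(f, steps=20000):
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / steps
    return sum(f((j + 0.5) * h) for j in range(steps)) * h

def denominator(n):
    """Radial part of |B|_2^2 for B = 1 - r^2 on the unit ball of R^n."""
    return radial_integral(lambda r: r ** (n - 1) * (1.0 - r * r) ** 2)

def second_moment_ratio(n):
    """Ratio of the radial integrals of |grad B|^2 = 4 r^2 and of B^2."""
    numerator = 4.0 * radial_integral(lambda r: r ** (n + 1))
    return numerator / denominator(n)

for n in (1, 2, 5, 10):
    print(n, denominator(n), 8.0 / (n * (n + 2) * (n + 4)),
          second_moment_ratio(n), n * (n + 4) / 2.0)
```

The quadratic growth in n is what produces the factor n^2 in the fourth property of Lemma 5 after applying Markov's inequality and rescaling by C/2.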



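We are now prepared to define $\tilde{f}$. We pick constants $C_1, \ldots, C_d$ (to be
determined later). We let

$$\rho(P_1, \ldots, P_d) = \rho_{C_1}(P_1) \cdot \rho_{C_2}(P_2) \cdots \rho_{C_d}(P_d). \quad (2)$$

Above, $\rho_{C_i}$ is defined on $R^{n_i}$. We let $\tilde{f}$ be the convolution $\tilde{f} = f * \rho$.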

6 Taylor Error 

In this Section, we prove the middle approximation of Equation 1 for
appropriately large k. The basic idea will be to approximate $\tilde{f}$ by its Taylor
series, $T$. $T(P(X))$ will be a polynomial of degree at most k and hence
$E[T(P(Y))] = E[T(P(X))]$. Furthermore, we will bound the Taylor error by
some polynomial $R$ and show that $E[R(P(Y))] = E[R(P(X))]$ is $O(\epsilon)$ for
appropriate choices of $m_i, C_i$. In particular, we let $T$ be the polynomial consisting
of all of the terms of the Taylor expansion of $\tilde{f}$ whose total degree in the $P_i$
coordinates is less than $m_i$ for all $i$. Note that a polynomial of this form is
about the best we can do since we only have control over the size of moments
up to the $m_i$-th moment on the $i$-th block of coordinates. Our error bound will be
the following

Proposition 6.

$$|T(P) - \tilde{f}(P)| \le \prod_{i=1}^{d} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) - 1.$$


First we prove a Lemma dealing with Taylor error for a single batch of
coordinates.

Lemma 7. If $g$ is a multivariate function, $\tilde{g} = g * \rho_C$ and $T$ is the polynomial
consisting of all terms in the Taylor expansion of $\tilde{g}$ of degree less than $m$, then

$$|\tilde{g}(x) - T(x)| \le \frac{C^m |x|^m |g|_\infty}{m!}.$$





Proof. Let $v$ be the unit vector in the direction of $x$. Let $L$ be the line through
0 and $x$. We note that the restriction of $T$ to $L$ is the same as the first $m - 1$ terms
of the Taylor series for $\tilde{g}|_L$. Using standard error bounds for Taylor polynomials
we find that

$$|\tilde{g}(x) - T(x)| \le \frac{|D_v^m \tilde{g}|_\infty |x|^m}{m!}.$$

But

$$|D_v^m \tilde{g}|_\infty = |g * D_v^m \rho_C|_\infty \le |g|_\infty |D_v^m \rho_C|_1 \le |g|_\infty C^m.$$

Plugging this in yields our result. □

Proof of Proposition 6. The basic idea of the proof will be to repeatedly apply
Lemma 7 to one batch of coordinates at a time. We begin by defining some
operators on the space of bounded functions on $R^{n_1} \times R^{n_2} \times \cdots \times R^{n_d}$. For such
$g$, define $g^{\tilde{i}}$ to be the convolution of $g$ with $\rho_{C_i}$ along the $i$-th set of coordinates.
Define $g^{T_i}$ to be the Taylor polynomial in the $i$-th set of variables of $g^{\tilde{i}}$ obtained by
taking all terms of total degree less than $m_i$. Note that for $i \ne j$ the operations
$\tilde{i}$ and $T_i$ commute with the operations $\tilde{j}$ and $T_j$ since they operate on disjoint
sets of coordinates. Note that $\tilde{f} = f^{\tilde{1}\tilde{2}\cdots\tilde{d}}$ and $T = f^{T_1 T_2 \cdots T_d}$. For $1 \le s \le d$ let
$\tilde{f}_s = f^{\tilde{1}\cdots\tilde{s}}$ and $T_s = f^{T_1 \cdots T_s}$.

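We prove by induction on $s$ that

$$|T_s(P) - \tilde{f}_s(P)| \le \prod_{i=1}^{s} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) - 1.$$

As a base case, we note that the $s = 0$ case of this is trivial.
Assume that

$$|T_s(P) - \tilde{f}_s(P)| \le \prod_{i=1}^{s} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) - 1.$$

We have that

$$|T_{s+1}(P) - \tilde{f}_{s+1}(P)| \le |T_s^{T_{s+1}}(P) - T_s^{\widetilde{s+1}}(P)| + |T_s^{\widetilde{s+1}}(P) - \tilde{f}_s^{\widetilde{s+1}}(P)|.$$

Note that $T_{s+1} = T_s^{T_{s+1}}$ and $\tilde{f}_{s+1} = \tilde{f}_s^{\widetilde{s+1}}$.
Therefore since $\widetilde{s+1}$ involves only convolution with a function of norm 1 we
have that

$$|T_s^{\widetilde{s+1}}(P) - \tilde{f}_s^{\widetilde{s+1}}(P)| \le |T_s(P) - \tilde{f}_s(P)|_{\infty, s+1}$$

where the subscript denotes the $L^\infty$ norm over just the $(s+1)$-st set of coordinates.
By the inductive hypothesis, this is at most

$$\prod_{i=1}^{s} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) - 1.$$

On the other hand, applying Lemma 7 we have that

$$|T_s^{T_{s+1}}(P) - T_s^{\widetilde{s+1}}(P)| \le \frac{C_{s+1}^{m_{s+1}} |P_{s+1}|^{m_{s+1}}}{m_{s+1}!}\, |T_s|_{\infty, s+1}.$$

By the inductive hypothesis,

$$|T_s|_{\infty, s+1} \le |\tilde{f}_s|_{\infty, s+1} + |T_s - \tilde{f}_s|_{\infty, s+1} \le \prod_{i=1}^{s} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right).$$

Combining the above bounds, we find that

$$|T_{s+1}(P) - \tilde{f}_{s+1}(P)| \le \prod_{i=1}^{s} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) - 1 + \frac{C_{s+1}^{m_{s+1}} |P_{s+1}|^{m_{s+1}}}{m_{s+1}!} \prod_{i=1}^{s} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) = \prod_{i=1}^{s+1} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) - 1.$$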

□ 



We can now prove the desired approximation result.

Proposition 8. If $\tilde{F}, P, T$ are as above with $m_i = \Omega_d(n_i C_i^2)$, $m_i \ge \log(2^d/\epsilon)$ for
all $i$, and if $k \ge d m_i$ for all $i$, then for X and Y k-independent families of
standard Gaussians,

$$E[\tilde{F}(Y)] \approx_\epsilon E[\tilde{F}(X)].$$

Proof. We note that since $T \circ P$ is a polynomial of degree at most k we have
that $E[T(P(X))] = E[T(P(Y))]$. Hence, it suffices to show that

$$E[|\tilde{f} - T|(P(X))],\ E[|\tilde{f} - T|(P(Y))] = O(\epsilon).$$

We will show this only for X as Y is analogous. By Proposition 6 we have that
$|\tilde{f} - T|$ is bounded by

$$\prod_{i=1}^{d} \left(1 + \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right) - 1.$$

This is a sum over non-empty subsets $S \subseteq \{1, 2, \ldots, d\}$ of

$$\prod_{i \in S} \frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}.$$

Since there are only $2^d - 1$ such $S$, it is enough to show that each term
individually has expectation $O(\epsilon/2^d)$. On the other hand, we have by AM-GM that
each term is at most

$$\frac{1}{|S|} \sum_{i \in S} \left(\frac{C_i^{m_i} |P_i|^{m_i}}{m_i!}\right)^{|S|}.$$

Now the expectation of $|P_i|^{m_i|S|}$ is at most $n_i^{m_i|S|/2}$ times the average of the
$m_i|S|$-th moments of the coordinates of $P_i$. These by assumption are $O_d(\sqrt{m_i|S|})^{m_i|S|}$.
There are $n_i$ coordinates so the $m_i|S|$-th moment of $|P_i|$ is at most $O_d(\sqrt{n_i m_i |S|})^{m_i|S|}$.
Hence the error is at most

$$O(2^d) \max_{i,S} O_d\left(\frac{C_i \sqrt{n_i m_i |S|}}{m_i}\right)^{m_i|S|} = O(2^d) \max_{i,S} O_d\left(C_i \sqrt{\frac{n_i}{m_i}}\right)^{m_i|S|} \le O(2^d) e^{-\min_i m_i} = O(\epsilon).$$

□ 
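
Two elementary facts carry the proof above: the product of the factors (1 + a_i), minus 1, expands as a sum over non-empty subsets, and each subset term is bounded through the AM-GM inequality by an average of powers. The sketch below checks both on sample non-negative numbers; the values and function names are assumptions of the example, with a[i] standing in for the quantity C_i^{m_i} |P_i|^{m_i} / m_i!.

```python
from itertools import combinations
from math import prod

def product_minus_one(a):
    """prod_i (1 + a_i) - 1."""
    return prod(1.0 + t for t in a) - 1.0

def subset_terms(a):
    """Map every non-empty index subset S to prod_{i in S} a_i."""
    terms = {}
    for size in range(1, len(a) + 1):
        for S in combinations(range(len(a)), size):
            terms[S] = prod(a[i] for i in S)
    return terms

def am_gm_bound(a, S):
    """(1/|S|) * sum_{i in S} a_i^{|S|}, an upper bound on prod_{i in S} a_i
    for non-negative a_i."""
    return sum(a[i] ** len(S) for i in S) / len(S)

a = [0.5, 2.0, 0.1]
terms = subset_terms(a)
print(len(terms), sum(terms.values()), product_minus_one(a))
print(all(value <= am_gm_bound(a, S) + 1e-12 for S, value in terms.items()))
```

With d factors there are 2^d - 1 subset terms, which is why the proof only needs each term to have expectation O(epsilon / 2^d).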



7 Approximation Error 

In this Section, we will prove the first and third approximations in Equation 1.
We begin with the first, namely

$$E[F(Y)] \approx_\epsilon E[\tilde{F}(Y)].$$

Our basic strategy will be to bound

$$|E[F(Y)] - E[\tilde{F}(Y)]| \le E[|F(Y) - \tilde{F}(Y)|].$$

In order to get a bound on this we will first show that $F - \tilde{F}$ is small except
where $p(Y)$ is small, and then use anti-concentration results to show that this
happens with small probability. This will be true because $\rho$ is small away from
0. We begin by proving a Lemma to this effect.

Lemma 9. Let $\rho$ be the function defined in Equation 2. Then for any $D > 0$
we have that

$$\int_{\exists i : |x_i| > D n_i \sqrt{d}/C_i} |\rho(x)|\,dx = O(D^{-2}).$$

This will hold essentially because of the concentration property held by each
$\rho_{C_i}$.

Proof. We integrate over the region where $|x_i| > D n_i \sqrt{d}/C_i$ for each $i$ separately. This is
a product over $j \ne i$ of $\int \rho_{C_j}(x_j)\,dx_j$ times $\int_{|x_i| > D n_i \sqrt{d}/C_i} |\rho_{C_i}(x_i)|\,dx_i$. By Lemma
5 the former integrals are all 1, and the latter is $O(D^{-2}/d)$. Summing over all
possible $i$ yields $O(D^{-2})$. □

Recall that $f$ was $\mathrm{sgn} \circ h$ where $h = \sum_i h_i$ is given in the decomposition of
$p$ from Proposition 4. Recall that $\tilde{f} = f * \rho$. We want to bound the error in
approximating $f$ by $\tilde{f}$. The following is a direct consequence of Lemma 9.

Lemma 10. Suppose $x = (x_1, \ldots, x_d) \in R^{n_1} \times \cdots \times R^{n_d}$. Suppose also that for
some $D > 0$ and for all $y = (y_1, \ldots, y_d) \in R^{n_1} \times \cdots \times R^{n_d}$ so that $|x_i - y_i| \le
D n_i \sqrt{d}/C_i$, $h(x)$ and $h(y)$ have the same sign. Then

$$|f(x) - \tilde{f}(x)| = O(\min\{1, D^{-2}\}).$$

Proof. To show that the error is $O(1)$, we note that since $\rho \ge 0$ and $\int \rho(x)\,dx = 1$,
$\tilde{f}(x) = (f * \rho)(x) \in [\inf(f), \sup(f)] \subseteq [-1, 1]$. Therefore $|f - \tilde{f}| \le |f| + |\tilde{f}| \le 2$.

For the latter, we note that $\tilde{f}(x) = \int f(y)\rho(x - y)\,dy$. We note that since
the total integral of $\rho$ is 1,

$$f(x) - \tilde{f}(x) = \int (f(x) - f(y))\rho(x - y)\,dy.$$

We note that by assumption, unless $|x_i - y_i| > D n_i \sqrt{d}/C_i$ for some $i$, the
integrand is 0. But outside of this, the integrand is at most $2\rho(x - y)$. By
Lemma 9 the total integral of this is $O(D^{-2})$. □

We now know that $\tilde{f}$ is near $f$ at points $x$ not near the boundary between the
$+1$ and $-1$ regions. Since we cannot directly control the size of these regions, we
want to relate this to the region where $|p|$ is small. This should work since
unless $x$ is very large, $h$ will have derivatives that aren't too big. In particular,
we prove the following.

Lemma 11. Let $x \in R^n$. Suppose that we have $B_i > 0$ so that $|P_{i,j}(x)| \le B_i$
for all $i, j$. We have that $|F(x) - \tilde{F}(x)|$ is at most the minimum of $O(1)$ and

Proof. The bound of $O(1)$ follows immediately from Lemma 10. For the other
bound, let

$$D = \min\left\{ \frac{|p(x)|}{\sum_{i=1}^{d} d 2^i n_i^2 B_i^{i-1}/C_i},\ \min_i \left\{ \frac{B_i C_i}{n_i \sqrt{d}} \right\} \right\}.$$

By Lemma 10 it suffices to show that for any $Q = (Q_1, \ldots, Q_d) \in R^{n_1} \times \cdots \times R^{n_d}$
so that $|Q_i - P_i(x)| \le D n_i \sqrt{d}/C_i$, $h(P(x)) = p(x)$ and $h(Q)$ have the same
sign. To do this, we write $h = h_1 + \cdots + h_d$ and we note that

$$|h(P(x)) - h(Q)| \le \sum_{i=1}^{d} |P_i(x) - Q_i|\,|h_i'(z)|,$$

where $h_i'(z)$ is the directional derivative of $h_i$ in the direction from $P_i(x)$ to
$Q_i$, and $z$ is some point along this line. First, note that $|Q_i - P_i(x)| \le B_i$.
Therefore, each coordinate of $z$ is at most $2B_i$. Note that $h_i$ is a sum of at
most $n_i$ monomials of degree $i$ with coefficients at most 1. The derivative of
each monomial at $z$ is at most $\sqrt{d}\, 2^i B_i^{i-1}$. Therefore, $|h_i'(z)| \le \sqrt{d}\, 2^i n_i B_i^{i-1}$.
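Therefore,

$$|h(P(x)) - h(Q)| \le \sum_{i=1}^{d} |P_i(x) - Q_i|\,|h_i'(z)| \le \sum_{i=1}^{d} (D n_i \sqrt{d}/C_i)(\sqrt{d}\, 2^i n_i B_i^{i-1}) \le D \sum_{i=1}^{d} d 2^i n_i^2 B_i^{i-1}/C_i \le |h(P(x))|.$$

Therefore $h(P(x))$ and $h(Q)$ have the same sign, so our bound follows by Lemma
10. □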

We take this bound on the approximation error and prove the following 
Lemma on the error of expectations. 

Lemma 12. Let Z be a random variable valued in $R^n$. Let $B_i \ge 1$ be real
numbers. Let $M = \sum_{i=1}^{d} n_i^2 B_i^{i-1}/C_i$. Then

$$|E[F(Z)] - E[\tilde{F}(Z)]| \le O_d\left(\Pr(\exists i,j : |P_{i,j}(Z)| > B_i) + M + \Pr(|p(Z)| < \sqrt{M})\right).$$

Furthermore,

$$E[\tilde{F}(Z)] \le E[F(Z)] + O_d\left(\Pr(\exists i,j : |P_{i,j}(Z)| > B_i) + M\right) + 2\Pr(-\sqrt{M} < p(Z) < 0),$$

and

$$E[\tilde{F}(Z)] \ge E[F(Z)] + O_d\left(\Pr(\exists i,j : |P_{i,j}(Z)| > B_i) + M\right) - 2\Pr(0 < p(Z) < \sqrt{M}).$$

Proof. We note that $|F(Z) - \tilde{F}(Z)| = O(1)$. Also note that $\frac{n_i}{B_i C_i} \le M$ for
all $i$. The first inequality follows by noting that Lemma 11 implies that unless
$|P_{i,j}(Z)| > B_i$ for some $i, j$ the following hold:

1. If $|p(Z)| < \sqrt{M}$, $|F(Z) - \tilde{F}(Z)| \le 2$.

2. If $|p(Z)| \ge \sqrt{M}$, $|F(Z) - \tilde{F}(Z)| = O_d(M)$.

The other two inequalities follow from noting that if $p(Z) < 0$, then $F(Z) \le
\tilde{F}(Z)$ and if $p(Z) > 0$ then $F(Z) \ge \tilde{F}(Z)$. □

We are almost ready to prove the first of our approximation results, but
we first need a theorem on the anti-concentration of Gaussian polynomials. In
particular a consequence of [2] Theorem 8 is:

Theorem 13 (Carbery and Wright). Let p be a degree d polynomial, and Y a
standard Gaussian. Suppose that $E[p(Y)^2] = 1$. Then, for $\epsilon > 0$,

$$\Pr(|p(Y)| < \epsilon) = O(d \epsilon^{1/d}).$$
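
The exponent 1/d in this anti-concentration bound can be seen on a one-variable example where the probability has a closed form. The sketch below is an illustration only; the choice p(y) = y^d / sqrt((2d-1)!!), normalized so that E[p(Y)^2] = 1, and the function names are assumptions of the example. For this p the small-ball probability is exactly erf(r / sqrt(2)) with r = (epsilon sqrt((2d-1)!!))^{1/d}, which scales like epsilon^{1/d}.

```python
from math import erf, sqrt

def odd_double_factorial(d):
    """(2d-1)!!, which equals E[Y^(2d)] for a standard Gaussian Y."""
    result = 1
    for j in range(1, 2 * d, 2):
        result *= j
    return result

def small_ball_probability(d, eps):
    """Pr(|p(Y)| < eps) for p(y) = y^d / sqrt((2d-1)!!), so E[p(Y)^2] = 1."""
    radius = (eps * sqrt(odd_double_factorial(d))) ** (1.0 / d)
    # For a standard Gaussian, Pr(|Y| < r) = erf(r / sqrt(2)).
    return erf(radius / sqrt(2.0))

for d in (1, 2, 3, 4):
    for eps in (1e-1, 1e-2, 1e-3):
        print(d, eps, small_ball_probability(d, eps), d * eps ** (1.0 / d))
```

In this example the probability stays below d epsilon^{1/d} and is within a constant factor of epsilon^{1/d}, so the dependence on epsilon in Theorem 13 cannot be improved in general.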

We are now prepared to prove our approximation result. 

Proposition 14. Let $p, F, \tilde{F}, h, m_i, n_i, C_i$ be as above and let $\epsilon > 0$. Let
$B_i = \Omega_d(\sqrt{\log(n_i/\epsilon)})$ be some real numbers. Suppose that $m_i > B_i^2$ and that
$C_i = \Omega_d(n_i^2 B_i^{i-1} \epsilon^{-2d})$ for all $i$. Then, if the implied constants for the bounds on $B_i$
and $C_i$ are large enough,

$$|E[F(Y)] - E[\tilde{F}(Y)]| = O(\epsilon).$$

Proof. We bound the error using Lemma 12. We note that the probability that
$|P_{i,j}(Y)| > B_i$ can be bounded by looking at the $\log(d n_i/\epsilon) = k$-th moment,
yielding a probability of $O_d(\sqrt{k}/B_i)^k \le \frac{\epsilon}{d n_i}$. Taking a union bound over all $j$
gives a probability of $\frac{\epsilon}{d}$. Taking a union bound over $i$ yields a probability of at
most $\epsilon$.

Next we note that

$$M = \sum_{i=1}^{d} n_i^2 B_i^{i-1}/C_i = O_d(\epsilon^{2d}).$$

Hence if our constants were chosen to be large enough, by Theorem 13,

$$\Pr(|p(Y)| < \sqrt{M}) = O(\epsilon).$$

This proves our result. □

If we could prove Proposition 14 for X instead of Y, we would be done.
Unfortunately, Theorem 13 does not immediately apply for families that are merely
k-independent. Fortunately, we can work around this to prove Proposition 2.
In particular, we will use the inequality versions of Lemma 12 to obtain upper
and lower bounds on $E[F(X)]$ in terms of $E[\mathrm{sgn}(p(Y) + c)]$, and make use of
anti-concentration for $p(Y)$.

Proof of Proposition 2. Let $B_i = \Omega_d(\sqrt{\log(1/\epsilon)})$ with sufficiently large
constants. Define $m_i$ and $C_i$ so that $C_i \ge \Omega_d\left(\left(\prod_{j=1}^{i-1} m_j\right)^2 B_i^{i-1} \epsilon^{-2d}\right)$ and
$m_i \ge \Omega_d\left(\left(\prod_{j=1}^{i-1} m_j\right) C_i^2\right), \log(2^d/\epsilon)$, all with sufficiently large constants. Note
that this is achievable by setting Ci — fid (^^^ '^^ : '^j — ^rf ^e"'^'^ '^^ . Let 
k — dmaxi mi. Note k can be as small as Odi^^'^'^'''"')- Using these parameters, 
define $n_i, h_i, P_{i,j}, f, \tilde{f}, \tilde{F}$ as described above. Note that since $n_i = O_d\left(\prod_{j=1}^{i-1} m_j\right)$,
$C_i = \Omega_d(n_i^2 B_i^{i-1} \epsilon^{-2d})$ and $m_i = \Omega_d(n_i C_i^2)$. Therefore, for Y a family of
independent standard Gaussians and X a family of k-independent standard
Gaussians, Propositions 8 and 14 imply that

$$E[F(Y)] \approx_\epsilon E[\tilde{F}(Y)] \approx_\epsilon E[\tilde{F}(X)].$$

We note that the $M$ in Lemma 12 is $O_d(\epsilon^{2d})$ with sufficiently small constant.
Therefore, by Lemma 12, $|E[F(X)] - E[\tilde{F}(X)]|$ is at most

$$O(\epsilon) + 2\Pr(|p(X)| < O_d(\epsilon^d)) + \Pr(\exists i,j : |P_{i,j}(X)| > B_i).$$

We note that by looking at the $\log(d n_i/\epsilon)$ moments of the $P_{i,j}$ the last
probability is $O(\epsilon)$. Therefore, combining this with the above we get that

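$$E[\tilde{F}(Y)] \ge E[F(X)] + O(\epsilon) - 2\Pr(0 < p(X) < O_d(\epsilon^d)),$$

and

$$E[\tilde{F}(Y)] \le E[F(X)] + O(\epsilon) + 2\Pr(-O_d(\epsilon^d) < p(X) < 0).$$

But this implies that

$$E[\mathrm{sgn}(p(X) - O_d(\epsilon^d))] \le E[\tilde{F}(Y)] + O(\epsilon),$$

and

$$E[\mathrm{sgn}(p(X) + O_d(\epsilon^d))] \ge E[\tilde{F}(Y)] + O(\epsilon).$$

On the other hand, applying the above to the polynomials $p \pm O_d(\epsilon^d)$,

$$E[\mathrm{sgn}(p(Y) - O_d(\epsilon^d))] + O(\epsilon) \le E[F(X)] \le E[\mathrm{sgn}(p(Y) + O_d(\epsilon^d))] + O(\epsilon).$$

But we have that

$$E[\mathrm{sgn}(p(Y) - O_d(\epsilon^d))] \le E[F(Y)] \le E[\mathrm{sgn}(p(Y) + O_d(\epsilon^d))].$$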

Furthermore, $\mathrm{sgn}(p(Y) - O_d(\epsilon^d))$ and $\mathrm{sgn}(p(Y) + O_d(\epsilon^d))$ differ by at most 2,
and only when $|p(Y)| = O_d(\epsilon^d)$. By Theorem [13] this happens with probability
$O_d(\epsilon)$. Therefore, we have that all of the expectations above are within $O_d(\epsilon)$
of $E[F(Y)]$, and hence $E[F(X)] = E[F(Y)] + O_d(\epsilon)$. Decreasing the value of $\epsilon$
by a factor depending only on $d$ (and increasing $k$ by a corresponding factor)
yields our result. □
8 General Polynomials



We have proved our theorem for multilinear polynomials, but would like to
extend it to general polynomials. Our basic idea will be to show that a general
polynomial is approximated by a multilinear polynomial in perhaps more
variables.

Lemma 15. Let $p$ be a degree $d$ polynomial and $\delta > 0$. Then there exists a
multilinear degree $d$ polynomial $p_\delta$ (in perhaps a greater number of variables) so that
for every $k$-independent family of random Gaussians $X$, there is a (correlated)
$k$-independent family of random Gaussians $\tilde X$ so that

$\Pr(|p(X) - p_\delta(\tilde X)| > \delta) < \delta.$

Proof. We will pick some large integer $N$ (how large we will say later). If
$X = (X_1, \dots, X_n)$, we let $\tilde X = (X_{i,j})$, $1 \le i \le n$, $1 \le j \le N$. For fixed $i$ we let
the collection of $X_{i,j}$ be the standard collection of $N$ standard Gaussians subject
to the condition that $X_i = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} X_{i,j}$. Equivalently, $X_{i,j} = \frac{1}{\sqrt{N}} X_i + Y_{i,j}$,
where the $Y_{i,j}$ are Gaussians with variance $1 - 1/N$ and covariance $-1/N$ with
each other.
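As an illustration only, the decomposition $X_{i,j} = \frac{1}{\sqrt{N}} X_i + Y_{i,j}$ can be simulated for a single coordinate. The sketch below is not from the paper: the helper names `split_gaussian` and `empirical_second_moments` are hypothetical, and it assumes one convenient way of realizing the $Y_{i,j}$, namely centering $N$ independent standard Gaussians by their sample mean, which gives variance $1 - 1/N$ and pairwise covariance $-1/N$.

```python
import math
import random

def split_gaussian(x, n_parts, rng):
    """Return X_1..X_N with sum(X_j)/sqrt(N) == x, via X_j = x/sqrt(N) + Y_j."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n_parts)]
    z_bar = sum(z) / n_parts
    # Assumption: Y_j = Z_j - mean(Z) has variance 1 - 1/N and covariance -1/N.
    return [x / math.sqrt(n_parts) + (z_j - z_bar) for z_j in z]

def empirical_second_moments(n_parts, trials, seed=0):
    """Estimate E[X_1^2] and E[X_1 X_2] when x is itself a standard Gaussian."""
    rng = random.Random(seed)
    sum_sq = 0.0
    sum_cross = 0.0
    for _ in range(trials):
        parts = split_gaussian(rng.gauss(0.0, 1.0), n_parts, rng)
        sum_sq += parts[0] ** 2
        sum_cross += parts[0] * parts[1]
    return sum_sq / trials, sum_cross / trials

rng = random.Random(1)
x = rng.gauss(0.0, 1.0)
parts = split_gaussian(x, 8, rng)
recombined = sum(parts) / math.sqrt(8)  # equals x up to rounding
var_est, cov_est = empirical_second_moments(n_parts=8, trials=20000)
```

Under these assumptions `recombined` reproduces `x`, while `var_est` should be close to 1 and `cov_est` close to 0, consistent with the $X_{i,j}$ being independent standard Gaussians.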

$\tilde X$ is $k$-independent because given any $i_1, \dots, i_k$, $j_1, \dots, j_k$ we can obtain
the $X_{i_\ell, j_\ell}$ by first picking the $X_{i_\ell}$ randomly and independently, and picking the
$Y_{i_\ell, j_\ell}$ independently of those. But we note that this yields the same distribution
we would get by setting all of the $X_{i_\ell, j}$ to be random independent Gaussians,
and letting $X_{i_\ell} = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} X_{i_\ell, j}$.

We now need to construct $p_\delta$ with the appropriate property. The idea will be
to replace each term $X_i^k$ in each monomial in $p$ with some degree $k$ polynomial
in the $X_{i,j}$. This will yield a multilinear degree $d$ polynomial in $\tilde X$. We will
want this new polynomial to be within $\delta'$ of $X_i^k$ with probability $1 - \delta'$ for $\delta'$
some small positive number depending on $p$ and $\delta$. This will be enough since
if $\delta' < \delta/(2dn)$ the approximation will hold for all $i,k$ with probability at least
$1 - \delta/2$. Furthermore with probability $1 - \delta/2$, each of the $|X_i|$ will be at most
$O(\log(n/\delta))$. Therefore if this holds and each of the replacement polynomials
is off by at most $\delta'$, then the value of the full polynomial will be off by at most
$O(\log^d(n/\delta)\,\delta')$ times the sum of the coefficients of $p$. Hence if we can achieve
this for $\delta'$ small enough we are done.

Hence, we have reduced our problem to the case of $p(X) = p(X_i) = X_i^d$.
For simplicity of notation, we use $X$ instead of $X_i$ and $X_j$ instead of $X_{i,j}$. We
note that

$X^d = \left( \frac{1}{\sqrt{N}} \sum_{i=1}^{N} X_i \right)^d.$

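Unfortunately, this is not a multilinear polynomial in the $X_j$. Fortunately,
it almost is. Expanding it out and grouping terms based on the multiset of
exponents occurring in them we find that

$X^d = N^{-d/2} \sum_{a_1 \le \dots \le a_k,\ \sum a_i = d} \binom{d}{a_1, a_2, \dots, a_k} \sum_{(i_1, \dots, i_k) \in S} \prod_{j=1}^{k} X_{i_j}^{a_j},$

where $S$ is the set of $i_1, \dots, i_k \in \{1, \dots, N\}$ distinct so that $i_j < i_{j+1}$ if $a_j = a_{j+1}$.
Letting $b_\ell$ be the number of $a_i$ that are equal to $\ell$ we find that this is

$N^{-d/2} \sum_{a_1 \le \dots \le a_k,\ \sum a_i = d} \binom{d}{a_1, \dots, a_k} \frac{1}{\prod_\ell b_\ell!} \sum_{i_1, \dots, i_k \in [N],\ i_j \text{ distinct}} \prod_{j=1}^{k} X_{i_j}^{a_j}.$

Or rewriting slightly, this is

$\sum_{a_1 \le \dots \le a_k,\ \sum a_i = d} \binom{d}{a_1, \dots, a_k} \frac{1}{\prod_\ell b_\ell!} \sum_{i_1, \dots, i_k \in [N],\ i_j \text{ distinct}} \prod_{j=1}^{k} \left( \frac{X_{i_j}}{\sqrt{N}} \right)^{a_j}.$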

Now, with probability $1 - \delta$, $\left| \sum_i \frac{X_i}{\sqrt{N}} \right| = O(\log(1/\delta))$. Furthermore with
probability tending to 1 as $N$ goes to infinity, $\sum_i \left( \frac{X_i}{\sqrt{N}} \right)^2 = 1 + O(\delta/\log^d(1/\delta))$,
and $\sum_i \left( \frac{X_i}{\sqrt{N}} \right)^a = O(\delta/\log^d(1/\delta))$ for each $3 \le a \le d$. If all of these events
hold, then each term in the above with some $a_j > 2$ will be $O(\delta)$, and any terms
with some $a_j = 2$ will be within $O(\delta)$ of

$\sum_{i_1, \dots, i_{k'} \in \{1, \dots, N\},\ i_j \text{ distinct}} \prod_{j=1}^{k'} \frac{X_{i_j}}{\sqrt{N}},$

where $k'$ is the largest $j$ so that $a_j = 1$. This gives a multilinear polynomial
that with probability $1 - \delta$ is within $O_d(\delta)$ of $p(X)$. Perhaps decreasing $\delta$ to
deal with the constant in the $O_d$ yields our result. □
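To make the replacement step concrete, here is a small numerical sketch for the single case $d = 2$. It is an illustration and not part of the paper. The replacement polynomial $1 + \frac{1}{N} \sum_{i \ne j} X_i X_j$ used below is worked out by hand from the expansion above (the $a = (2)$ term is replaced by 1, and the $a = (1,1)$ term is kept), and the function name `degree_two_replacement` is hypothetical.

```python
import math
import random

def degree_two_replacement(parts):
    """Multilinear stand-in for X^2: 1 + (1/N) * sum over i != j of X_i * X_j."""
    n = len(parts)
    off_diagonal = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                off_diagonal += parts[i] * parts[j]
    return 1.0 + off_diagonal / n

rng = random.Random(0)
N = 400
parts = [rng.gauss(0.0, 1.0) for _ in range(N)]  # the X_j
x = sum(parts) / math.sqrt(N)                    # X = N^{-1/2} * sum_j X_j
approx = degree_two_replacement(parts)
error = abs(x * x - approx)
# The error is exactly |(1/N) * sum_j X_j^2 - 1|, which shrinks as N grows.
diagonal_gap = abs(sum(v * v for v in parts) / N - 1.0)
```

In this sketch the only source of error is how far $\frac{1}{N} \sum_j X_j^2$ is from 1, which matches the role of the event $\sum_i (X_i/\sqrt{N})^2 = 1 + O(\delta/\log^d(1/\delta))$ in the proof.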

We can now prove Theorem [1].

Proof of Theorem [1]. Let $p$ be a normalized degree $d$ polynomial. Let $k$ be as
required by Proposition [2]. Let $Y$ be a family of independent standard Gaussians
and $X$ a $k$-independent family of standard Gaussians. Fix $\delta = (\epsilon/d)^d$. Let
$p_\delta, \tilde X, \tilde Y$ be as given by Lemma [15]. We need to show that $\Pr(p(X) > 0) =
\Pr(p(Y) > 0) + O(\epsilon)$. By construction of $p_\delta$,

$\Pr(p(X) > 0) \ge \Pr(p_\delta(\tilde X) > \delta) - \delta.$

Applying Proposition [2] to the multilinear polynomial $p_\delta - \delta$, this is at least

$\Pr(p_\delta(\tilde Y) > \delta) + O(\epsilon).$
Since $\tilde Y$ is $\ell$-independent for all $\ell$ (since $Y$ is), it is actually an independent family
of Gaussians. Therefore by Theorem [13], $\Pr(|p(Y)| < \delta) = O(d\delta^{1/d}) = O(\epsilon)$.
Hence

$\Pr(p(X) > 0) \ge \Pr(p_\delta(\tilde Y) > -\delta) + O(\epsilon).$

Noting that with probability $1 - \delta$, $p_\delta(\tilde Y)$ is at most $\delta$ less than $p(Y)$, this
is at least

$\Pr(p(Y) > 0) + O(\epsilon).$

So

$\Pr(p(X) > 0) \ge \Pr(p(Y) > 0) + O(\epsilon).$

Similarly,

$\Pr(p(X) < 0) \ge \Pr(p(Y) < 0) + O(\epsilon).$

Combining these we clearly have

$\Pr(p(X) > 0) = \Pr(p(Y) > 0) + O(\epsilon)$

as desired. □



9 Fooling PTFs of Bernoulli Random Variables 

Theorem [1] should also hold when $X$ is a $k$-independent family of Bernoulli
random variables and $Y$ is a fully independent family of Bernoulli random
variables. The proof is essentially the same as in the Gaussian case with a few
minor changes that need to be made. In particular, the following steps do not
carry over immediately:

1. The reduction from the case of a general polynomial to that of a multilinear
polynomial

2. Theorem [13] does not hold for Bernoulli random variables

3. Theorem [3] is not stated for the Bernoulli case

The first of these problems is even easier to deal with in the Bernoulli case 
than in the Gaussian case. This is because any degree-d polynomial is equal to 
some degree-d multilinear polynomial on the hypercube. 
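The reduction on the hypercube can be sketched in a few lines. The example below is an illustration and not from the paper: it assumes the hypercube is $\{-1, +1\}^n$, so that $x_i^2 = 1$ and every exponent can be reduced mod 2, and it uses a hypothetical dictionary representation mapping exponent tuples to coefficients.

```python
from itertools import product

def multilinearize(poly):
    """Reduce each exponent mod 2, using x_i^2 = 1 on {-1, +1}^n."""
    reduced = {}
    for exponents, coeff in poly.items():
        key = tuple(e % 2 for e in exponents)
        reduced[key] = reduced.get(key, 0) + coeff
    return reduced

def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at a point."""
    total = 0
    for exponents, coeff in poly.items():
        term = coeff
        for x_i, e in zip(point, exponents):
            term *= x_i ** e
        total += term
    return total

# Hypothetical example: p(x) = 3*x0^2*x1 - x1^3 + 2*x0*x1*x2 + 5*x2^2
p = {(2, 1, 0): 3, (0, 3, 0): -1, (1, 1, 1): 2, (0, 0, 2): 5}
q = multilinearize(p)
agree = all(evaluate(p, pt) == evaluate(q, pt)
            for pt in product((-1, 1), repeat=3))
```

Reducing exponents never raises the degree, so `q` is a multilinear polynomial of degree at most that of `p` that agrees with `p` at every point of the hypercube.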

The second of these problems can be dealt with by fairly standard means.
In particular, the Invariance Principle of [7] implies that for sufficiently regular
polynomials $p$, $p(X)$ is anticoncentrated even for $X$ a Bernoulli random
variable. We are still left with the problem of reducing ourselves to the case of
a regular polynomial. This would be done using a regularity lemma similar to
that proven in [5], showing that an arbitrary polynomial threshold function can
be written as a decision tree on a small number of coordinates such that most of
the leaves are approximated by regular polynomial threshold functions. A slight
modification of this result, telling us that these "approximations" hold
even on $k$-independent inputs, would allow us to reduce to the case of a regular
polynomial after determining the values of Od(e~'^^''^) coordinates. 

The last of these concerns is apparently more significant, but can be dealt
with by proving that Theorem [3] does hold for polynomials of Bernoullis. In
particular, one can show that a higher moment of a polynomial with respect to
the Bernoulli distribution can be bounded in terms of the corresponding moment
with respect to the Gaussian distribution. Specifically, we show that:

Lemma 16. Let $p$ be a homogeneous degree-$d$ multilinear polynomial and $k \ge 1$.
Let $X$ be a Bernoulli random variable and $Y$ a Gaussian random variable. Then

$E[|p(X)|^k] = O(1)^{dk} E[|p(Y)|^k].$

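Proof (Thanks to Jelani Nelson). Let $\sigma = (\sigma_1, \dots, \sigma_n)$ be an $n$-dimensional
Bernoulli random variable and $G = (g_1, \dots, g_n)$ an $n$-dimensional Gaussian random
variable independent of $\sigma$. Note that $\sigma_i |g_i|$ is distributed as a Gaussian.
Therefore we have that

$E[|p(G)|^k] = E[|p(\sigma_1|g_1|, \dots, \sigma_n|g_n|)|^k] = E_G[E_\sigma[|p(\sigma_1|g_1|, \dots, \sigma_n|g_n|)|^k]].$

By the convexity of the $L^k$ norm this is at least

$E_\sigma\left[ \left| E_G[p(\sigma_1|g_1|, \dots, \sigma_n|g_n|)] \right|^k \right].$

On the other hand, since $p$ is homogeneous and multilinear of degree $d$ and the
$g_i$ are independent, we have that

$E_G[p(\sigma_1|g_1|, \dots, \sigma_n|g_n|)] = (E|g_1|)^d\, p(\sigma) = (2/\pi)^{d/2}\, p(\sigma).$

Therefore we have that

$E[|p(\sigma)|^k] \le (\pi/2)^{dk/2}\, E[|p(G)|^k] = O(1)^{dk} E[|p(G)|^k].$

As desired. □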
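As a sanity check on the direction of the inequality in Lemma 16, the following sketch compares the two moments for one small example. It is an illustration only. The polynomial $p(x) = x_1 x_2 + x_2 x_3$ and the helper names are hypothetical choices, "Bernoulli" is taken to mean uniform on $\{-1, +1\}$ as in the proof above, and the Gaussian moment is only estimated by Monte Carlo.

```python
import random
from itertools import product

def p(x):
    """Hypothetical homogeneous multilinear polynomial of degree d = 2."""
    return x[0] * x[1] + x[1] * x[2]

def bernoulli_moment(k):
    """Exact E|p(sigma)|^k over uniform sigma in {-1, +1}^3."""
    points = list(product((-1, 1), repeat=3))
    return sum(abs(p(s)) ** k for s in points) / len(points)

def gaussian_moment(k, trials, seed=0):
    """Monte Carlo estimate of E|p(G)|^k for G standard Gaussian in R^3."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        total += abs(p(g)) ** k
    return total / trials

k = 4
bern = bernoulli_moment(k)                 # exactly 8.0 for this p
gauss = gaussian_moment(k, trials=200000)  # the true value is 36
```

For this example the Bernoulli moment is already below the Gaussian one, so the factor $(\pi/2)^{dk/2}$ allowed by the lemma is not needed here. The lemma only asserts that the Bernoulli moment cannot exceed the Gaussian moment by more than that factor.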



10 Conclusion 

The bounds on $k$ presented in this paper are far from tight. At the very least
the argument in Lemma [12] could be strengthened by considering a larger range
of cases of $|p(x)|$ rather than just whether or not it is larger than $\sqrt{M}$. This
alone would give us bounds on $k$ of the form $O_d(\epsilon^{-x^d})$ for some $x$ less
than 7. I suspect that the correct value of $k$ is actually $O(d^2 \epsilon^{-2})$, and in fact
such large $k$ will actually be required for $p(x) = \prod_{i=1}^{d} \left( \sum_j x_{i,j} \right)$. On the other
hand, this bound is at the moment somewhat beyond our means. It would be
nice at least to see if a bound of the form $k = O_d(\epsilon^{-\mathrm{poly}(d)})$ can be proven.
The main contribution of this work is to prove that there is some sufficient $k$ that
depends on only $d$ and $\epsilon$.